Assessing immune responses to study vaccines as surrogates of
protection plays a central role in vaccine clinical trials. Motivated by 
three ongoing or pending HIV vaccine efficacy trials, we consider such 
surrogate endpoint assessment in a randomized placebo-controlled 
trial with case-cohort sampling of immune responses and a time to 
event endpoint. Based on the principal surrogate definition under the 
principal stratification framework proposed by Frangakis and Rubin 
[Biometrics 58 (2002) 21-29] and adapted by Gilbert and Hudgens 
(2006), we introduce estimands that measure the value of an im- 
mune response as a surrogate of protection in the context of the 
Cox proportional hazards model. The estimands are not identified 
because the immune response to vaccine is not measured in placebo 
recipients. We formulate the problem as a Cox model with missing 
covariates, and employ novel trial designs for predicting the missing 
immune responses and thereby identifying the estimands. The first 
design utilizes information from baseline predictors of the immune 
response, and bridges their relationship in the vaccine recipients to 
the placebo recipients. The second design provides a validation set for 
the unmeasured immune responses of uninfected placebo recipients 
by immunizing them with the study vaccine after trial closeout. A 
maximum estimated likelihood approach is proposed for estimation 
of the parameters. Simulated data examples are given to evaluate the 
proposed designs and study their properties. 

1. Introduction. The evaluation of vaccine efficacy in vaccine clinical tri- 
als is generally costly, either because it takes a long trial period for the clin- 
ical outcomes to be observed, or because the vaccine may only be partially 
effective. Therefore, identifying vaccine-induced immune responses as surrogate
markers for the true study endpoint has spawned interest in vaccine
research [Halloran (1998), Chan, Wang and Heyse (2003) and Gilbert et al.
(2005)]. The potential surrogate would usually be measured shortly after
administration of the study vaccine, and if it can be validated then the 
vaccine's protective effect can be inferred from it. As knowledge builds on 
the immunological mechanism for protecting against disease by a pathogen, 
finding a good immunological surrogate is promising for iteratively guiding 
refinement of the vaccine formulation, and ultimately for providing a basis 
for regulatory decisions. 

There is an extensive literature on the evaluation of surrogate endpoints 
for therapeutic development [e.g., Prentice (1989), Lin, Fleming and De Gruttola
(1997), De Gruttola et al. (2002), Molenberghs et al. (2002) and Weir and Walley
(2006)]. The assessment of an immunological surrogate focuses on contrasting
the clinical outcome rate between vaccine recipients and placebo recipients,
given the measured immune responses. Since immune response measurements
are made post-randomization, this assessment is subject to selection
bias [Frangakis and Rubin (2002) and Gilbert, Bosch and Hudgens
(2003)]. To address this problem, Gilbert and Hudgens (2006) (henceforth 
GH) proposed to evaluate the value of a biomarker as a surrogate endpoint 
by estimating the causal effect predictiveness (CEP) surface, which con- 
trasts the clinical outcome rates between the vaccine recipients and placebo 
recipients within principal strata formed by joint values of the potential im- 
mune responses under assignment to vaccine or placebo. This work built on 
Frangakis and Rubin's (2002) potential outcomes framework for evaluating
principal surrogate endpoints. GH considered a binary clinical outcome and 
used a baseline predictor approach to predict the principal strata and esti- 
mate the CEP surface nonparametrically. We develop a similar method for 
a time-to-event clinical endpoint, which is most commonly used in vaccine 
clinical trials, and use the Cox proportional hazards model [Cox (1972)] to 
describe the relationship between the survival outcome and covariates in- 
cluding the potential surrogate. Our likelihood calculations utilize discrete 
failure time models, which are suitable for many vaccine trials because clin- 
ical endpoints are often assessed at pre-specified dates. 

In the principal stratification framework, the principal strata are subject 
to missingness as only the immune response to the actual treatment as- 
signment (vaccine or placebo) is observed. This situation was described as 
the "fundamental challenge of causal inference" [Holland (1986)]. The unob- 
served immune response is missing for the subjects that receive the "oppo- 
site" assignment. We focus on a marginal estimand that conditions on the 
immune response to the vaccine. Consequently, the assessment of a surrogate 
in the Cox model framework can be cast as a problem of estimation with 
a missing covariate. Although methods for estimating the Cox model with 
missing covariates have been extensively studied [e.g., Lin and Ying (1993),
Robins, Rotnitzky and Zhao (1994), Zhou and Pepe (1995), Paik and Tsai
(1997), Chen and Little (1999), Herring and Ibrahim (2001), Chen (2002)
and Little and Rubin (2002)], their application to the proposed surrogate
assessment is not direct, as the missing data are entirely in the placebo
group. Techniques are called for to predict the "missing" immune responses 
in the placebo recipients, or a random sample of them. Therefore, we extend 
the innovative designs proposed by Follmann (2006) for a binary endpoint 
to the Cox model setting. 

Follmann (2006) proposed two novel components to vaccine trials: baseline 
irrelevant predictor (BIP), and closeout placebo vaccination (CPV), which 
enable inference about the vaccine-specific immune responses of placebo re- 
cipients. BIP utilizes association between the response of interest and an- 
other baseline immune response thought to be irrelevant to infection in the 
vaccinated subjects. CPV involves vaccinating uninfected placebo recipi- 
ents after study completion. To match ongoing and pending HIV vaccine 
trials, we extend these strategies to accommodate a time to event clini- 
cal endpoint and sampling of immune responses via a case-cohort design 
[e.g., Prentice (1986), Borgan et al. (2000), Scheike and Martinussen (2004) 
and Kulich and Lin (2004)]. We focus on a sampling design that uses data 
from all infected subjects and a random subcohort of uninfected subjects 
for whom the immune response to the vaccine is measured (termed "im- 
munogenicity subcohort," IC). The methods also apply for other sampling 
designs, such as failure status-independent case-cohort sampling. We also 
consider measuring the BIP on some subjects outside the IC, which can 
help improve efficiency. 

Under the BIP design placebo subjects cannot be selected into the IC; 
similarly, infected placebo subjects cannot enter IC in the CPV design. Such 
null selection probabilities violate a key assumption for most semiparametric 
approaches to handling missing covariates in Cox regression, including all 
that are based on partial likelihood. Accordingly, we employ a full-likelihood
estimation procedure based on discrete failure time (DFT) models. For continuous failure
time data, we also consider an approximate semiparametric algorithm for 
the estimation of the BIP-alone design by extending the EM algorithm of 
Chen (2002). 

The proposed methods will be applied to analyze three U.S. National 
Institutes of Health-sponsored HIV vaccine efficacy trials. These trials ran- 
domize HIV negative high risk volunteers to vaccine or placebo in a 1:1 ratio, 
and follow participants until a fixed number of HIV infection events. The first 
two trials (named STEP 502 [Mehrotra, Li and Gilbert (2006)] and HVTN 
503) are ongoing in the Americas and South Africa, respectively, and eval- 
uate Merck's Adenovirus serotype 5 (Ad5) vector vaccine in approximately 
3000 subjects. The third trial (named PAVE-100), co-sponsored by the U.S. 
Military HIV Research Program, the International AIDS Vaccine Initiative,
and the Centers for Disease Control and Prevention, is being planned.
The current PAVE-100 design will randomize approximately 8500 volun- 
teers from 13 countries in the Americas, East Africa, and Southern Africa 
to placebo or the Vaccine Research Center's prime-boost vaccine regimen 
(DNA prime: Ad5 vector boost). The trials plan to analyze approximately 
100, 120 and 280 HIV infection events, respectively. A secondary objective 
of each trial is to evaluate the magnitude of CD8+ T cell response levels,
as measured by the ELISpot assay from blood samples drawn after Ad5 
immunization, as a surrogate for HIV infection. The neutralizing antibody 
titer to Ad5 is measured at baseline for all participants. Because it is inversely
correlated with the CD8+ T cell responses [Catanzaro et al. (2006)],
it potentially may be used as a BIP.

To develop our approach for assessing surrogate endpoints in vaccine tri- 
als, we present the general framework, assumptions, and definition of the 
estimands in Section 2, design considerations in Section 3, and an estima- 
tion procedure in Section 4. In Section 5 we evaluate the approach with 
simulated trials designed to match the aforementioned HIV trials. A discus- 
sion follows in Section 6. 

2. The principal stratification framework. In this section we introduce
the framework, which is based on potential outcomes and principal
stratification [Frangakis and Rubin (2002) and Rubin (2005)].

Let $n$ denote the total number of subjects in the vaccine trial. For subject
$i$ ($i = 1, \ldots, n$), let $V_i$ denote the observed treatment indicator, $W_i$ denote a
collection of first-phase baseline covariates in the case-cohort sampling
(measured on everyone), and $S_i(V)$ denote the potential immune response of the
subject if he/she is assigned vaccine ($V = 1$) or placebo ($V = 0$). Similarly,
for $V = 1, 0$, let $T_i(V)$ and $C_i(V)$ be the potential failure time and censoring
time, and let $X_i(V) = \min\{T_i(V), C_i(V)\}$ and $\delta_i(V) = I(T_i(V) < C_i(V))$.
Let $t_1, \ldots, t_K$ indicate the fixed visit times, with $t_2, \ldots, t_K$ the possible
discrete failure times for $X_i(V_i)$. Let denote censored at the final visit and
$M_i$ denote the last visit number of subject $i$ during the trial period, thus,
$M_i \in \{1, \ldots, K\}$. For vaccine recipients at-risk at $t_1$ and in the IC, the
immune response $S_i(V)$ is measured at time $t_1$. Letting $R_i(V)$ denote the
potential at-risk indicator at $t_1$, $S_i(V)$ is only defined if $R_i(V) = 1$;
otherwise, we put $S_i(V) = *$. We assume that the censoring process $C_i(V)$ and
failure time distribution $T_i(V)$ are independent given $\{W_i, R_i(V), S_i(V)\}$.

Suppose that $\{V_i, W_i, R_i(0), R_i(1), S_i(0), S_i(1), X_i(0), X_i(1), \delta_i(0), \delta_i(1), i = 1, \ldots, n\}$
are i.i.d. We make the following assumptions to identify the estimands:

A1. Stable unit treatment value assumption (SUTVA).
A2. Ignorable treatment assignments. Conditional on $W_i$, $V_i$ is independent
of $\{R_i(0), R_i(1), S_i(0), S_i(1), X_i(0), X_i(1), \delta_i(0), \delta_i(1)\}$.

Assumption A1 guarantees the "consistency" property (i.e., the observed
outcomes for a subject assigned $V$ equal his/her potential outcomes if assigned
$V$) and that the potential outcomes of one subject are not impacted by the
treatment assignments of other subjects. A2 holds for randomized, blinded
trials.

Under the above assumptions, we define two vaccine efficacy estimands:

1. Conditional on joint potential outcomes (joint VE)

$$VE(s_1, s_0) = 1 - \frac{\Pr(T(1) = t_k \mid T(1) > t_{k-1}, S(1) = s_1, S(0) = s_0, R(1) = 1, R(0) = 1)}{\Pr(T(0) = t_k \mid T(0) > t_{k-1}, S(1) = s_1, S(0) = s_0, R(1) = 1, R(0) = 1)},$$

2. Conditional on marginal potential outcome (marginal VE)

$$VE(s_1) = 1 - \frac{\Pr(T(1) = t_k \mid T(1) > t_{k-1}, S(1) = s_1, R(1) = 1)}{\Pr(T(0) = t_k \mid T(0) > t_{k-1}, S(1) = s_1, R(1) = 1)},$$

$k = 2, \ldots, K$.

The estimand $VE(s_1, s_0)$ conditions on membership in the basic principal
stratum $\{S(1) = s_1, S(0) = s_0, R(1) = R(0) = 1\}$, and $VE(s_1)$ conditions
on membership in a union of basic principal strata [Frangakis and Rubin
(2002)]. The estimands condition on $R_i(1) = R_i(0) = 1$ or on $R_i(1) = 1$
because $S_i(V)$ is only defined if $R_i(V) = 1$, $V = 0, 1$. The estimands are
principal stratification estimands in that the pair $(S(1), S(0))$ or $S(1)$ can be
treated as a baseline covariate. However, they are not causal estimands,
because the numerators and denominators condition on different events
$T(1) > t_{k-1}$ and $T(0) > t_{k-1}$. Nevertheless they are scientifically
interesting, in the same way that a hazard ratio conditional on baseline covariates
is interesting.

To help identify the estimands, only subjects with $R_i(V_i) = 1$ are included
in the analysis, and we assume the following:

A3. Equal drop-out and risk up to time $t_1$: $R_i(1) = 1 \iff R_i(0) = 1$.

A3 implies that subjects observed to be at risk at $t_1$ will have $R_i(1) =
R_i(0) = 1$, so that $S_i(1)$ and $S_i(0)$ are both defined.

In addition to A1-A3, identifiability of $VE(s_1, s_0)$ requires a way to
predict $S_i(1)$ for subjects with $V_i = 0$ and a way to predict $S_i(0)$ for
subjects with $V_i = 1$. Identifiability of $VE(s_1)$ is easier because only the $S_i(1)$
for subjects in arm $V_i = 0$ must be predicted. Furthermore, for our
motivating application, typically the immune response $S_i(0)$ is zero for all
placebo recipients, because exposure to the vaccine is necessary to
stimulate an immune response. For these reasons, henceforth, we focus on the
marginal estimand $VE(s_1)$. Note that, for applications with $S_i(0) = 0$ for all
$i$, $VE(s_1) = VE(s_1, 0)$.

We propose a Cox model for the discrete cumulative hazard function $\Lambda(t)$,

$$d\Lambda(t_k; V, S(1) = s_1, R(1) = 1, W) = \exp(Z'\beta)\, d\Lambda_0(t_k), \qquad k = 2, \ldots, K, \tag{1}$$

with $Z = \{V, S(1), VS(1), W'\}'$, $\beta = (\beta_1, \beta_2, \beta_3, \beta_4')'$, and $\Lambda_0(\cdot)$ the
discrete baseline cumulative hazard function. The marginal $VE(s_1)$ can be
expressed as

$$VE(s_1) = 1 - \frac{d\Lambda(t_k; V = 1, S(1) = s_1, R(1) = 1)}{d\Lambda(t_k; V = 0, S(1) = s_1, R(1) = 1)}.$$

The discrete hazards always condition on $\{R(1) = 1\}$ and, henceforth, we
assume this implicitly. For subjects with a particular baseline covariate $w$,
a similar estimand $VE(s_1 \mid w)$ can be formed by conditioning on $W = w$ in
the hazards.

The population estimand $VE(s_1)$ contrasts the rate of the clinical event
for subjects with $S(1) = s_1$ under assignment to vaccine versus under
assignment to placebo. Supposing $S(1)$ is bounded below at value zero, which
indicates a negative immune response, we define $S$ to be a predictive
surrogate if $VE(0) = 0$ and $VE(s_1) > 0$ for all $s_1 > C$ for some constant $C > 0$.
These conditions reflect population level necessity and sufficiency of the
immune response to achieve positive vaccine efficacy.

Under A1-A3 and the Cox model (1), the estimand equals

$$VE(s_1) = 1 - \exp(\beta_1 + \beta_3 s_1). \tag{2}$$

In equation (2) a negative value of $\beta_3$ indicates that a higher immune
response to vaccine predicts greater vaccine efficacy. On the other hand, $\beta_3 = 0$
implies $VE(s_1)$ is constant in $s_1$ so that the marker does not predict vaccine
efficacy. Therefore, testing $H_0: \beta_3 = 0$ versus $H_1: \beta_3 < 0$ assesses sufficiency.
A value $\beta_1 = 0$ indicates necessity, and both $\beta_1 = 0$ and $\beta_3 < 0$ indicate the
marker is a predictive surrogate. The magnitude of $\beta_3$ indicates the quality
of the predictive surrogate, with $\beta_3 = 0$ suggesting no surrogate value
[$VE(s_1)$ is constant in $s_1$] and larger $|\beta_3|$ suggesting greater surrogate value
(greater predictiveness).
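
To make the roles of $\beta_1$ and $\beta_3$ in equation (2) concrete, the short sketch below evaluates $VE(s_1)$ on a few marker values. The function name `marginal_ve` and the coefficient values are illustrative assumptions chosen for this example, not estimates from any trial.

```python
import math

def marginal_ve(s1, beta1, beta3):
    """Evaluate VE(s1) = 1 - exp(beta1 + beta3 * s1), the form in equation (2)."""
    return 1.0 - math.exp(beta1 + beta3 * s1)

# Hypothetical coefficients: beta1 = 0 gives VE(0) = 0 (necessity), and
# beta3 < 0 makes VE increase with the immune response (sufficiency).
beta1, beta3 = 0.0, -0.7
for s1 in (0.0, 0.5, 1.0, 2.0):
    print(f"s1 = {s1:.1f}  VE = {marginal_ve(s1, beta1, beta3):.3f}")
```

Setting `beta3 = 0.0` in the same call returns a value that does not change with `s1`, which is the "no surrogate value" case described above.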



3. Augmented designs for estimation. The immune response to the study
vaccine, $S(1)$, cannot be measured in placebo recipients, but it may be inferred
when utilizing either the BIP or CPV designs (see Figure 1).
Fig. 1. Illustration of an HIV vaccine trial design under the BIP and CPV strategies.
Under BIP or BIP + CPV, baseline measurements of $W$ and $B$ are obtained from all (or a
random sample of) the study participants prior to the randomization at time 0. The study
subjects are then randomized to receive inoculation $V$ of the study vaccine or placebo. For
some vaccine recipients, the immune response to the vaccine $S(1)$ is measured at time $t_1$.
The subsequent assessments of HIV infection are conducted at discrete times $t_2, \ldots, t_K$.
The study subjects are followed until diagnosis of HIV infection (HIV+) or study closeout
at or after $t_K$. Under CPV or BIP + CPV, placebo recipients uninfected (HIV-) at study
closeout (or a random sample of them) are immunized with the study vaccine and the
immune response $S^c(1)$ is measured $t_1$ units of time afterward.



Baseline Irrelevant Predictor (BIP). Assume a baseline covariate $B$ is
available that does not affect (i.e., is "irrelevant" for) clinical risk after
accounting for the immune response $S(1)$ and first-phase covariates $W$:

A4. $d\Lambda(t_k; V, S(1), W, B) = d\Lambda(t_k; V, S(1), W)$, $k = 2, \ldots, K$, $V = 0, 1$.

Assumptions A1-A3 imply that the relationship between $S(1)$ and $B$ is
the same regardless of treatment assignment

$$[S_i(1) \mid V_i = 1, B_i, R_i(1) = 1] = [S_i(1) \mid V_i = 0, B_i, R_i(1) = 1]. \tag{3}$$

Therefore, $S_i(1)$ can be predicted or imputed for placebo subjects based on
$B_i$. For vaccine recipients with the BIP measured and who are outside the
IC, their immune responses are predicted using the BIP as well.

In case-cohort designs, good baseline predictors need to be highly correlated
with the biomarker $S(1)$, and preferably include first-phase (measured
on everyone) inexpensive covariates to achieve efficiency gains.
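
The bridging idea behind the BIP design can be illustrated with a small simulation: a regression of $S(1)$ on $B$ is fitted in vaccine recipients only and then applied to placebo recipients. This is a minimal sketch under assumed values (bivariate normal data with correlation 0.9 and variance 0.4, a linear fit, and a single predicted value per subject); the estimation procedure in this paper integrates over the conditional distribution rather than plugging in one prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
rho, var = 0.9, 0.4                      # illustrative correlation and variance
cov = var * np.array([[1.0, rho], [rho, 1.0]])
s1, b = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
v = rng.integers(0, 2, size=n)           # 1 = vaccine, 0 = placebo (randomized)

# S(1) is observable only in vaccine recipients, so fit the S(1)-on-B line there.
vac = v == 1
slope, intercept = np.polyfit(b[vac], s1[vac], deg=1)

# Bridge the fitted relationship to placebo recipients, as equation (3) permits.
s1_pred_placebo = intercept + slope * b[~vac]

# In a simulation the normally unobservable placebo S(1) is available for checking.
corr = np.corrcoef(s1_pred_placebo, s1[~vac])[0, 1]
print(f"slope = {slope:.3f}, correlation with true S(1) in placebo = {corr:.3f}")
```

Lowering `rho` in this sketch weakens the prediction in the placebo arm, which is why a highly correlated baseline predictor is preferred.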

Closeout Placebo Vaccination (CPV). This design entails vaccinating
uninfected placebo subjects after the study closeout, and measuring their
immune response $S_i^c(1)$. The closeout measurement $S_i^c(1)$ is made at a visit $t_1$
time units after vaccination, to match the measurement schedule in the
vaccine trial. We need to make an additional assumption to bridge the marker
values $S_i(1)$ and $S_i^c(1)$. Let $S_i^{\mathrm{true}}(1)$ be the true immune response at time
$t_1$, allowing that the observed immune response is subject to some assay
measurement error.

A5. Time constancy of $S_i^{\mathrm{true}}(1)$: For uninfected placebo recipients, $S_i(1) =
S_i^{\mathrm{true}}(1) + e_{i1}$ and $S_i^c(1) = S_i^{\mathrm{true}}(1) + e_{i2}$, where $e_{i1}$ and $e_{i2}$ are
independent and identically distributed random errors with mean 0.

This assumption implies that the true immune response is unchanged
from time $t_1$ to study closeout plus $t_1$, and the measurement errors have the
same distribution. Thus, $S_i(1)$ and $S_i^c(1)$ are exchangeable and one can be
used in lieu of the other. To be concrete, suppose only one shot is given,
the trial is three years, and $t_1$ is 6 months after the shot. A5 states that
the true immune response 6 months after the shot is the same whether it is
measured January 1, 2004 or January 1, 2007. In the Discussion we outline
how our methods can be generalized to use $S_i^{\mathrm{true}}(1)$ in the Cox model (1)
rather than $S_i(1)$. Note that even if the regression involves $S_i^{\mathrm{true}}(1)$, a valid
test of the effect of $S_i^{\mathrm{true}}(1)$ obtains when using $S_i(1)$ [Prentice (1982)]. If
time constancy of immune response is not reasonable, then $S_i^c(1)$ cannot be
used in lieu of $S_i(1)$ and CPV may be questionable. See Follmann (2006) for
further discussion of this issue, including how to examine this assumption.

Under A5, the distribution of $[S_i(1) \mid V_i = 0, \delta_i = 1]$ can be inferred from
the marginal distributions $[S_i(1) \mid V_i = 1] = [S_i(1) \mid V_i = 0]$: the placebo
marginal is a mixture of its infected and uninfected components, and CPV
supplies the uninfected component. However, in case-cohort sampling, if the IC
is small, then the large amount of missing data
and the inferred immune responses in placebo recipients may challenge the
performance of the method.

Baseline irrelevant predictor and closeout placebo vaccination combined
(BIP + CPV). The BIP and CPV designs can be combined by imputing
$S_i(1)$ with $S_i^c(1)$ for all uninfected placebo recipients with $S_i^c(1)$ measured,
and predicting $S_i(1)$ with $B_i$ for all others with $B_i$ measured. Combining
the designs can yield large efficiency gains. In the situation where there is
no good baseline predictor or the baseline predictor is expensive to collect,
conducting small-scale CPV on a random sample of the uninfected placebo
recipients can add accuracy and precision to the estimates.

4. Estimation. Estimation of the estimand is challenged by the amount
of missing $S(1)$'s. We focus on the maximum estimated likelihood (MEL)
estimation procedure that applies to all three designs. We then briefly
outline an approximate EM-type algorithm for estimation with the BIP-alone
design.

4.1. Maximum estimated likelihood estimation. We present below the es- 
timation procedure for the BIP + CPV design, which includes estimation 
under the BIP- or CPV-alone designs as special cases. 

Let $IC_v$ denote the immunogenicity cohort that contributes second-phase
data $S(1)$ in vaccine recipients, and $IC_p$ denote the cohort within uninfected
placebo subjects that received vaccination at study closeout, so that $IC =
IC_v \cup IC_p$. Let $IB$ denote the set of subjects with $B$ measured, which can be
larger than IC. For placebo subjects that do not have $S(1)$ measured, their
likelihood contribution integrates over the marginal distribution of $S(1)$ or
the conditional distribution of $S(1) \mid B$. The full log-likelihood of model (1)
under the BIP + CPV design (with convention that $\prod_{j=2}^{1} = 1$) is given by



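$$\log L(\beta, \lambda_0) = \sum_{i \in IC_v} \log L_1(O_i) + \sum_{i \in IC_p} \log L_2(O_i) + \sum_{i \notin IC,\, i \in IB} \log L_3(O_i) + \sum_{i \notin IC,\, i \notin IB} \log L_4(O_i), \tag{4}$$

where

$$\begin{aligned}
L_1(O_i) = {} & \prod_{j=2}^{M_i-1} (1 - \lambda_{0j})^{\exp\{v_i\beta_1 + s_i\beta_2 + v_i s_i\beta_3 + w_i'\beta_4\}} \\
& \times \bigl\{1 - (1 - \lambda_{0M_i})^{\exp\{v_i\beta_1 + s_i\beta_2 + v_i s_i\beta_3 + w_i'\beta_4\}}\bigr\}^{\delta_i} \\
& \times (1 - \lambda_{0M_i})^{\exp\{v_i\beta_1 + s_i\beta_2 + v_i s_i\beta_3 + w_i'\beta_4\}(1 - \delta_i)},
\end{aligned}$$

$L_2(O_i)$ has the same form with the closeout measurement $s_i^c$ in place of $s_i$,

$$\begin{aligned}
L_3(O_i) = \int {} & \prod_{j=2}^{M_i-1} (1 - \lambda_{0j})^{\exp\{v_i\beta_1 + s\beta_2 + v_i s\beta_3 + w_i'\beta_4\}} \\
& \times \bigl\{1 - (1 - \lambda_{0M_i})^{\exp\{v_i\beta_1 + s\beta_2 + v_i s\beta_3 + w_i'\beta_4\}}\bigr\}^{\delta_i} \\
& \times (1 - \lambda_{0M_i})^{\exp\{v_i\beta_1 + s\beta_2 + v_i s\beta_3 + w_i'\beta_4\}(1 - \delta_i)} \, dP(s \mid b_i, w_i),
\end{aligned}$$

and $L_4(O_i)$ has the same form as $L_3(O_i)$ with $dP(s \mid w_i)$ in place of $dP(s \mid b_i, w_i)$.
Here $s_i$ and $s_i^c$ denote the observed values of $S_i(1)$ and $S_i^c(1)$,
$\lambda_0 = \{\lambda_{02}, \ldots, \lambda_{0K}\}^T$ are unknown baseline hazards (with $\lambda_{0k} = d\Lambda_0(t_k)$,
$k = 2, \ldots, K$), and $P(s \mid w)$ and $P(s \mid b, w)$ are the conditional c.d.f.'s of $S(1)$.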
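
The sketch below shows how one subject's contribution to a discrete-time likelihood of this kind could be evaluated, and how a missing $S(1)$ might be integrated out. It assumes the grouped-time form in which the discrete hazard at visit $k$ is $1 - (1 - \lambda_{0k})^{\exp(z'\beta)}$, a discrete (categorical) distribution for the unobserved marker, and a last visit number of at least 2. The function names `contribution` and `integrated_contribution`, and all numerical values, are hypothetical.

```python
import numpy as np

def contribution(lam0, m, delta, lin_pred):
    """Likelihood contribution of one subject whose linear predictor is known.

    lam0     : baseline hazards for visits t_2..t_K (lam0[0] belongs to visit 2)
    m        : last visit number M_i, assumed to satisfy 2 <= m <= K
    delta    : 1 if the event was observed at visit m, 0 if censored
    lin_pred : v*beta1 + s*beta2 + v*s*beta3 + w'beta4
    """
    r = np.exp(lin_pred)
    surv_before = np.prod((1.0 - lam0[: m - 2]) ** r)   # event-free at visits 2..m-1
    surv_at_m = (1.0 - lam0[m - 2]) ** r                # event-free at visit m
    return surv_before * (1.0 - surv_at_m) ** delta * surv_at_m ** (1 - delta)

def integrated_contribution(lam0, m, delta, v, w_term, beta, s_values, s_probs):
    """Average the contribution over a discrete distribution for the unobserved S(1)."""
    b1, b2, b3 = beta
    total = 0.0
    for s, p in zip(s_values, s_probs):
        lin_pred = v * b1 + s * b2 + v * s * b3 + w_term
        total += p * contribution(lam0, m, delta, lin_pred)
    return total

# Hypothetical inputs: K = 6 visits, a placebo recipient infected at visit 4,
# and a three-point distribution standing in for the estimated P(s | b, w).
lam0 = np.full(5, 0.02)
beta = (-0.5, -1.0, -0.7)
value = integrated_contribution(lam0, 4, 1, 0, 0.0, beta,
                                s_values=[0.0, 0.5, 1.0], s_probs=[0.2, 0.5, 0.3])
print(value)
```

With a continuous marker the sum over `s_values` would be replaced by numerical integration against the estimated conditional distribution.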

In the Cox model formulation, the estimand $VE(s_1)$ depends only on
$\beta$ while the parameters in the conditional c.d.f.'s $P(s \mid w)$ and $P(s \mid b, w)$
are nuisance parameters. Rather than maximizing the full likelihood over
the entire parameter space, we take the MEL approach [Pepe and Fleming
(1991)] to avoid specifying the joint distribution of $(S(1), B, W)$ and the
intensive computations entailed in the numerical integration. The conditional
c.d.f.'s $P(s \mid w)$ and $P(s \mid b, w)$ are first consistently estimated from the
vaccine recipients' data (Section 4.1.1), and then the estimated likelihood
$\log L(\beta, \lambda_0, \hat{P}(\cdot), \hat{P}(\cdot \mid \cdot))$ is constructed.

For a categorical $W$, $P(s \mid w)$ and $P(s \mid b, w)$ can be estimated
nonparametrically. However, if $W$ is continuous, then nonparametric estimation will
require smoothing and much larger sample sizes are needed for tractable
computation. Therefore, if $W$ is continuous or multi-component, parametric
assumptions on the conditional c.d.f.'s will usually be needed to achieve
stable estimation in practice. An advantage of the MEL approach is that it can
straightforwardly accommodate any approach to estimating the nuisance
parameters $P(s \mid w)$ and $P(s \mid b, w)$. In the MEL approach we first estimate
these distributions consistently using data from the vaccine recipients, and
then construct the estimated likelihood $L(\beta, \lambda_0, \hat{P}(\cdot), \hat{P}(\cdot \mid \cdot))$.

We outline three key steps in the evaluation of the log-likelihood (4) in 
the absence of the first-phase covariates W: 

1. Estimation of $p(s)$ and $p(s \mid b)$. Let $p(s)$, $p(b)$, and $p(s, b)$ be marginal
and joint p.d.f.s (or p.m.f.s for discrete variables) for $S(1)$ and $B$. Because
vaccine recipients in the $IC_v$ provide nonrandom samples of $S(1)$ and $B$,
and vaccine recipients in the $IB$ contribute additional data for $B$, it follows
that

$$\begin{aligned}
p(s) &= f_{11}(s)\,p_{11} + f_{10}(s)\,p_{10}, \\
p(b) &= f_{11}(b)\,p_{11} + f_{10}(b)\,p_{10}, \\
p(s, b) &= f_{11}(s, b)\,p_{11} + f_{10}(s, b)\,p_{10},
\end{aligned} \tag{5}$$

where, for $h = 1, 0$, $f_{1h}(\cdot)$ is the conditional p.d.f. or p.m.f. of $S(1)$ given $V =
1$ and $\delta = h$, and $p_{1h} = \Pr(\delta = h \mid V = 1)$. The probabilities $\{p_{1h}\}$ can be
estimated by their sample counterparts $\{\hat{p}_{1h}\}$ and estimates of $\{f_{1h}(s), f_{1h}(b), f_{1h}(s, b)\}$.
We sketch the estimation for two special cases where (A) $(S(1), B)$ are
categorical and (B) $(S(1), B)$ are bivariate normally distributed.

(A) If $S(1)$ and $B$ have discrete values with $J$ and $L$ categories,
respectively, then $f_{1h}(s_j)$ and $p(S(1) = s_j \mid b_l)$ ($j = 1, \ldots, J$, $l = 1, \ldots, L$) can be
estimated nonparametrically:

$$\hat{f}_{1h}(s_j) = \frac{\sum_{i \in IC_v} I(S_i(1) = s_j, \delta_i = h)}{\sum_{i \in IC_v} I(\delta_i = h)},$$

$$\hat{p}(S(1) = s_j \mid b_l) = \frac{\sum_{i \in IC_v, B_i = b_l} \delta_i\, I(S_i(1) = s_j)}{\sum_{i \in IC_v, B_i = b_l} \delta_i}\, \hat{p}_{11} + \frac{\sum_{i \in IC_v, B_i = b_l} (1 - \delta_i)\, I(S_i(1) = s_j)}{\sum_{i \in IC_v, B_i = b_l} (1 - \delta_i)}\, \hat{p}_{10}.$$

(B) If $(S(1), B)$ are jointly normally distributed, then $p(s)$ and $p(s \mid b)$ are
both normal densities and thus can be estimated using estimates of the first
and second moments from expressions in (5).
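
The first line of (5) can be illustrated for a categorical marker. Because the IC over-represents infected subjects, the raw distribution of $S(1)$ within the IC is biased; weighting the within-stratum distributions by the infection probabilities corrects this. The sketch below is only an illustration under assumed inputs: the function name `estimate_marginal_pmf`, the three marker levels, the risk model, and the sample size are all hypothetical, and each infection stratum is assumed to contain at least one sampled subject.

```python
import numpy as np

def estimate_marginal_pmf(s, delta, in_ic, levels):
    """Estimate p(s) = f11(s) p11 + f10(s) p10 among vaccine recipients.

    s      : S(1) values (only entries with in_ic True are used)
    delta  : infection indicators for ALL vaccine recipients
    in_ic  : True where S(1) was measured (membership in IC_v)
    levels : the distinct categories of S(1)
    """
    s, delta, in_ic = map(np.asarray, (s, delta, in_ic))
    pmf = np.zeros(len(levels))
    for h in (1, 0):
        p1h = np.mean(delta == h)                 # sample counterpart of p_1h
        sampled = in_ic & (delta == h)
        for j, level in enumerate(levels):
            f1h = np.mean(s[sampled] == level)    # conditional p.m.f. within IC_v
            pmf[j] += f1h * p1h
    return pmf

# Hypothetical vaccine arm: infection risk decreases with the marker level, and
# the IC holds every infected subject plus a 25% sample of the uninfected.
rng = np.random.default_rng(1)
n = 20000
levels = np.array([0, 1, 2])
true_pmf = np.array([0.3, 0.4, 0.3])
s = rng.choice(levels, size=n, p=true_pmf)
delta = rng.binomial(1, 0.20 * np.exp(-0.7 * s))
in_ic = (delta == 1) | (rng.random(n) < 0.25)

est = estimate_marginal_pmf(s, delta, in_ic, levels)
naive = np.array([np.mean(s[in_ic] == lv) for lv in levels])
print("weighted:", est.round(3), "naive:", naive.round(3), "truth:", true_pmf)
```

In this setup the naive IC proportions overstate the lowest marker level, since infected subjects are both oversampled and more likely to have a low response.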

Evaluating the likelihood (4) involves integrations over s, which are briefly 
described in the Appendix. 

2. Maximization and implementation. The estimated log-likelihood
$\log L(\beta, \lambda_0, \hat{P}(\cdot), \hat{P}(\cdot \mid \cdot))$ is maximized using quasi-Newton methods. The
assumption that $S(1)$ is observed with nonzero probability in all subjects is
violated. Therefore, the asymptotic variance of $\hat{\beta}$ via the MEL approach
cannot be derived analytically. We propose to obtain the standard errors for
$\hat{\beta}$ by the bootstrap. For computational efficiency, the software for
estimation is implemented in Matlab 7.0.1 (Mathworks, Inc) with a C++ plug-in,
compiled to a dynamic link library.
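
A bootstrap standard error of this kind can be sketched generically as follows. The sketch assumes simple resampling of subjects with replacement; the text does not say whether resampling should be stratified (for example by treatment arm or infection status), so that choice is left open here. The function name `bootstrap_se` and the toy estimator are hypothetical stand-ins for the MEL fit.

```python
import numpy as np

def bootstrap_se(data, estimator, n_boot=50, seed=0):
    """Nonparametric bootstrap standard error of estimator(data).

    data      : array whose rows are subjects
    estimator : callable mapping a resampled data array to a parameter vector
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample subjects with replacement
        draws.append(estimator(data[idx]))
    return np.std(np.asarray(draws), axis=0, ddof=1)

# Toy check with a sample mean, whose standard error is known to be sd / sqrt(n).
rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, size=(2000, 1))
se = bootstrap_se(x, lambda d: d.mean(axis=0), n_boot=200)
print(se, 1.0 / np.sqrt(2000))
```

In an actual analysis `estimator` would refit the nuisance distributions and re-maximize the estimated likelihood on each resample, which is why a modest number of replicates is attractive computationally.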



4.2. Approximate EM-type estimation. In this subsection we present an
estimation approach that uses regression calibration to impute the missing
$S_i(1)$s for subjects with a BIP $B_i$ measured and employs an EM-type
algorithm based on full likelihood to accommodate the missing $S_i(1)$s for
subjects without $B_i$ measured. Because the CPV-based designs have missing
$S(1)$s for the entire $\{V = 0, \delta = 1\}$ stratum, the algorithm can only reliably
estimate the Cox model parameters for the BIP-alone design, as confirmed
in simulations. We focus on the BIP-alone design with a continuous BIP in
this section. The proposed algorithm can be applied to a categorical BIP
with slight modification. An advantage of this EM approach is that it
accommodates continuous failure times.

Because the missingness of $S(1)$ does not depend on unobserved $S(1)$,
and we assume the censoring distribution does not depend on $S(1)$, the
log-likelihood for the BIP-alone design can be expressed up to an additive constant
as

$$\begin{aligned}
l(\beta, \alpha, \Lambda_0) = {} & \sum_{i \in IC} \{\delta_i (Z_i'\beta) - \Lambda_0(X_i) \exp(Z_i'\beta)\} \\
& + \sum_{i \notin IC,\, i \in IB} \log \int \exp\{\delta_i (Z_i'\beta) - \Lambda_0(X_i) \exp(Z_i'\beta)\} \, dP(s \mid V_i, W_i, B_i) \\
& + \sum_{i \notin IC,\, i \notin IB} \log \int \exp\{\delta_i (Z_i'\beta) - \Lambda_0(X_i) \exp(Z_i'\beta)\} \, dP(s \mid V_i, W_i) \\
& + \sum_{i=1}^{n} \delta_i \log(d\Lambda_0(X_i)),
\end{aligned}$$

where $X_i$ denotes the observed failure time, $\Lambda_0(\cdot)$ denotes the baseline
cumulative hazard function, and $\alpha$ represents unknown parameters in the
conditional distributions of $S(1)$.

The log-likelihood score equations can be solved via an iterative EM
algorithm [Chen and Little (1999), Herring and Ibrahim (2001), Chen (2002)].
For computational efficiency, we propose to modify the double-semiparametric
EM-algorithm of Chen (2002) to incorporate the auxiliary covariate $B$ as
a predictor of the missing $S(1)$. Given equation (3) and the relationship
$S_i(1) = g(B_i \mid \theta) + \varepsilon_i$, where $g(\cdot)$ is a parametric link function depending on
the unknown parameter $\theta$ and $\varepsilon_i$ has mean zero and variance $\sigma^2$, $S_i(1)$ can
be predicted by $E(S(1) \mid B_i) = g(B_i \mid \theta)$. When the event occurrence is rare,
$E(S(1) \mid B_i) \approx E(S(1) \mid B_i, X_i, \delta_i)$. This fact has been well studied in the
context of regression calibration in the Cox regression [e.g., Prentice (1982)
and Wang et al. (1997)]. Therefore, unobserved $S(1)$'s can be replaced by
$\hat{E}(S(1) \mid B)$ and treated as observed data in the EM algorithm. We name this
procedure the "Approximate Calibration-Based EM (ACEM)" algorithm.
An outline of this procedure is given below; interested readers are referred
to Chen (2002) for details:

1. Calibration-step: Prediction of unobserved $S_i(1)$s by $\hat{S}_i(1) = E(S(1) \mid B_i)$.

2. E-step: Given parameter values at the $m$th iteration $(\beta^{(m)}, \Lambda_0^{(m)}(X), \alpha^{(m)},
p_{kij}^{(m)}, \theta^{(m)})$, where $p_{kij}$ denotes the probability mass of the observed
distinct values of $S(1)$ at discrete levels of $V$ and $W_d$ ($W =
W_d \cup W_c$, where $W_d$ and $W_c$ denote the categorical and continuous
covariates in $W$, respectively), and $\theta$ denotes the parameters in the
distribution $P(W_c \mid S(1), V, W_d, X, \delta)$, calculate conditional expectations under
$P(S(1) \mid V, W_d, X, \delta)$.

3. M-step: Update $(\beta, \Lambda_0(X), \alpha, p_{kij}, \theta)$ by solving the corresponding score
equations.

4. Repeat the E-step and M-step above until convergence.

The advantage of the ACEM algorithm is that it can account for continu- 
ous failure times and is computationally fast; however, since it uses regression 
calibration, it performs well only for the rare event situation with a highly 
predictive BIP. Prevention trials, which usually have a low event rate, are 
an application area. 

5. Simulation study. We conducted a simulation study to evaluate the
performance of the proposed strategies for estimating the estimand $VE(s_1)$
and thereby assessing a predictive surrogate in the Cox model setting. To
simulate the real scenarios, we roughly follow the design of the three HIV
vaccine efficacy trials described in the introduction. We suppose a total
sample size of 5000, with 2500 subjects per arm. The treatment indicator
$V = 1$ if assigned vaccine and $V = 0$ if assigned placebo. Under the
case-cohort sampling, the immunogenicity subcohort (IC) consists of all infected
vaccine recipients and a random sample of 25% or 50% of uninfected vaccine
recipients. We considered one auxiliary covariate $B$ as the BIP for the potential
immunological surrogate $S(1)$. The variables $S(1)$ and $B$ were generated from
a bivariate normal distribution with mean zero and variance 0.4 for each
component [reflecting the variance of the ELISpot assay used to measure
$S(1)$ = CD8+ T cell response], and correlation $\rho = 0.6$ or 0.9. For the
BIP-alone and BIP + CPV designs, we assume that $B$ was measured from all
individuals in the IC and from 50% or 37.5% of those not in the IC, as a
precision factor. In the BIP-alone approach, $S(1)$ was treated as missing for
all placebo recipients, while for the BIP + CPV and CPV-alone approaches,
we assume 25% or 50% of uninfected placebo recipients got the CPV
measurement $S^c(1)$. Infection times were generated from the continuous-time Cox
model $\lambda(t \mid V, S(1)) = \lambda_0(t) \exp\{\beta_1 V + \beta_2 S(1) + \beta_3 V S(1)\}$, and were grouped
into 6 equal-length time intervals to reflect the discrete visit schedule of
the trials. The true parameter $\beta_2$ was set at $-1.109$ and $\beta_3$ at 0, $-0.4$, or
$-0.7$, reflecting the null hypothesis that $S(1)$ has no value as a predictive
surrogate and alternative hypotheses of 1.2-fold and 1.5-fold lower relative
risks $RR(S(1)) = 1 - VE(S(1))$ per 1 standard deviation higher immune
response $S(1)$, corresponding to low and high surrogate value, respectively.
In addition, $\lambda_0(t) = \lambda_0$ and $\beta_1$ were calibrated to give $VE(0) = 0.5$ and 334
infections expected in the placebo arm, and hence, 7% overall infection rate.
Random censoring of 10% was added to account for subject drop out. All
uninfected subjects were censored at the end of the follow-up period,
specified at 3 years. Five hundred simulation runs and 50 bootstrap replicates
were used to obtain standard error estimates for the estimated regression
parameters.
We first conducted estimation through the MEL algorithm for discrete
failure times using all three designs. For the BIP-alone design, a second 
simulation was conducted to compare the performance of the MEL approach 
for grouped failure times, versus that of the ACEM algorithm assuming 
continuous failure times were observed in a rare event setting. To evaluate 
efficiencies for the parameter estimates, estimates from the Cox model using 
the full simulated data were obtained as an unattainable "gold standard." 

Figure 2 plots the true $VE(s_1)$ curve for different true parameters
in model (2). It shows that when $\beta_3 = -0.7$, $VE(0) = 0$ and
$VE(s_1) > 0$ for $s_1 > 0$, indicating that the immune response variable
is a predictive surrogate.

Table 1 presents simulation results for the MEL approach in different
settings. The method performs well: the biases and variances of the estimates
are generally small, and the test of $H_0: \beta_3 = 0$ for surrogate value
has good power. As more CPV or auxiliary BIP information becomes available,
both the accuracy and precision of the estimates improve. The efficiency of
the designs involving the BIP increases as the correlation between the BIP
and $S(1)$ increases. The CPV-alone design is less




Fig. 2. Illustration of the estimand $VE(s_1)$ as a function of the
standardized potential surrogate $S(1)$ over the range of observable values
(horizontal axis: $S(1) = s_1$) with different values for $\beta_3$.



Table 1 
Results from the MEL estimation 
Note. $\beta^{(0)} = (-0.693, -1.109, 0)$; $\beta^{(4)} = (-0.849, -1.109, -0.4)$;
$\beta^{(7)} = (-0.996, -1.109, -0.7)$. SE = Monte Carlo standard error; ASE =
average of the bootstrap standard error from 50 bootstrap samples; RE =
relative efficiency, (ASE(gold standard)$^2$/ASE(missing)$^2$) $\times$ 100%;
Power is for testing $H_0: \beta_3 = 0$. "Large Missing" and "Medium Missing"
patterns indicate the IC size of 25% or 50% with additional 25% or 37.5% BIP
data for designs with BIP, and include closeout $S^c(1)$ data from 25% or 50%
uninfected placebo recipients for designs with CPV, respectively.

Fig. 3. Relative efficiencies of parameter estimators. For designs with BIP,
"Large Missing" and "Medium Missing" patterns indicate the IC size of 25% or
50% with additional 25% or 37.5% BIP data, respectively. For the design with
CPV, "Large Missing" and "Medium Missing" patterns include closeout $S^c(1)$
data from 25% or 50% uninfected placebo recipients, respectively. True values
of $\beta$ ($\beta^{(0)}$, $\beta^{(4)}$, $\beta^{(7)}$) are as specified in
Tables 1 and 2.



efficient because none of the infected placebo subjects have $S^c(1)$ measured.
Figure 3 displays the relative efficiencies of the parameter estimators from
the three designs with missing $S(1)$ with respect to the gold standard
estimators. Overall, the relative efficiency increases as the amount of
measured immune responses increases. The relative efficiency of
$\hat{\beta}_2$ is largely impacted by the amount of missing data, while that
of $\hat{\beta}_1$ is less sensitive to the missing data pattern. These
results are consistent with our design assumptions.
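
The relative efficiency reported in the tables is the squared ratio of
average bootstrap standard errors, expressed as a percentage. A minimal
sketch of that calculation is given below; the helper names `bootstrap_se`
and `relative_efficiency` are hypothetical, and the simple subject-level
resampling shown here is an assumption, since the text does not describe the
bootstrap scheme in detail.

```python
import numpy as np

def bootstrap_se(estimator, data, n_boot=50, seed=0):
    """Bootstrap standard error of estimator(data), resampling rows of data."""
    rng = np.random.default_rng(seed)
    n = len(data)
    reps = [estimator(data[rng.integers(0, n, n)]) for _ in range(n_boot)]
    return np.std(reps, axis=0, ddof=1)

def relative_efficiency(ase_gold, ase_missing):
    """RE = (ASE(gold standard)^2 / ASE(missing)^2) x 100%."""
    return 100.0 * (np.asarray(ase_gold) / np.asarray(ase_missing)) ** 2
```

For example, an estimator whose average standard error is twice that of the
gold standard has a relative efficiency of 25%.
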

Table 2 lists results from both the MEL approach and the ACEM algo- 
rithm under the BIP-alone design and the medium missing case (the IC size 
of 50% with additional 37.5% first phase BIP data). It demonstrates that 
the performance of the ACEM method is very sensitive to the prediction 
accuracy of the baseline predictor. When the BIP is a fairly inaccurate
predictor of $S(1)$ ($\rho = 0.6$), the ACEM method produces large biases and does



Table 2
Comparison of results between the MEL and ACEM approaches for the BIP-alone
design with the "Medium Missing" pattern (the IC size of 50% with additional
37.5% BIP data)

Note. $\beta^{(0)} = (-0.693, -1.109, 0)$; $\beta^{(4)} = (-0.849, -1.109, -0.4)$;
$\beta^{(7)} = (-0.996, -1.109, -0.7)$. SE = Monte Carlo standard error; ASE =
average of the bootstrap standard error from 50 bootstrap samples; RE =
relative efficiency, (ASE(gold standard)$^2$/ASE(missing)$^2$) $\times$ 100%;
Power is for testing $H_0: \beta_3 = 0$.

not control the type-I error rate, while if the calibration is reliable
($\rho = 0.9$), the ACEM algorithm generally estimates the parameters well.
The MEL estimation outperforms the ACEM in most settings, with a slight loss
of efficiency due to the grouping of the survival times.

6. Discussion. We have proposed a framework for assessing an immuno- 
logical predictive surrogate in a vaccine trial with a time to event endpoint 
and case-cohort sampling of the immunological biomarker. While we have 
focused on the methods development for vaccine trials, the proposed prin- 
ciples are applicable for evaluating predictive surrogate endpoints in other 
biomedical applications. 

We have discussed study designs and estimation procedures, and provided
simulation results to demonstrate their validity and applicability under the
stated assumptions. We plan to apply the BIP-alone design to the three ongoing
or pending HIV vaccine efficacy trials. As demonstrated by the simulation
study, if good baseline irrelevant predictors exist, then a predictive
surrogate can be evaluated effectively. The CPV-alone design is also a useful
assessment tool that is complementary to the BIP-alone design. If resources
permit, the BIP + CPV design merits consideration because it improves
accuracy and efficiency compared to the BIP-alone design if baseline
predictors are not closely correlated with the potential predictive surrogate,
or if A4 appears to be violated (i.e., the BIP affects clinical risk after
controlling for the potential surrogate and first-phase baseline covariates).

For simplicity, we assumed equal drop-out and risk for each subject under
assignment to vaccine or placebo over the time interval $[0, t_1]$ (assumption
A3), and restricted the analysis to subjects at risk at the time the immune
response is measured, $t_1$. To include all randomized subjects, A3 can be
relaxed by postulating that the future immune response that will be measured
at time $t_1$ impacts the risk of infection over $[0, t_1]$. With the DFT Cox
model (1), the likelihood contribution of a subject with early infection
during $[0, t_1]$ can be obtained as
$\int \{1 - (1 - \lambda_0)^{\exp(\beta_1 V_i + \beta_2 s + \beta_3 V_i s + \beta_4^T W_i)}\}\, dP(s)$,
where $P(\cdot)$ is the marginal distribution of $S(1)$, $P(s)$, or its
conditional distribution, $P(s \mid B_i)$, if the BIP $B_i$ is measured.
Another way to potentially weaken A3 would be to assume equal infection
probabilities in $[0, t_1]$ for the vaccine and placebo groups, but not
require that the vaccine has no effect for every individual.

A4 is a strong untestable assumption. Because we assume $B$ and $S(1)$ are
correlated, A4 implies that the phase one covariates $W$ capture all the causes
of $S(1)$ and the clinical endpoint [in the sense of Pearl (2000)]. Furthermore,
it may be difficult to find a baseline covariate $B$ that is known to not affect
clinical risk after accounting for $S(1)$. We suggest three potentially useful
$B$'s for vaccine trials. First, a study that vaccinated 75 individuals
simultaneously with hepatitis A and B vaccines showed a linear correlation of
0.85 between A- and B-specific antibody titers [Czeschinski, Binding and
Witting (2000)]. Given there is little cross-reactivity among the hepatitis A
and B proteins, $B$ = hepatitis A titer may be an excellent baseline predictor
for $S(1)$ = hepatitis B titer that satisfies A4. For HIV vaccine trials, two
available scalar $B$'s may plausibly satisfy A4. First, Follmann (2006)
considered as $B$ the antibody titer to a rabies glycoprotein vaccine. Because
rabies is not acquired sexually, it is plausible that anti-rabies antibodies
are independent of risk of HIV infection given $S(1)$. Second, in the ongoing
HIV vaccine efficacy trials, a current leading candidate $B$ is the titer of
antibodies that neutralize the Adenovirus serotype 5 vector that carries the
HIV genes in the vaccine. This $B$ has been shown to inversely correlate with
the $S(1)$ of primary interest (T cell response levels measured by ELISpot)
[Catanzaro et al. (2006)], and since Adenovirus 5 is a respiratory infection
virus, A4 may plausibly hold.

In general, though, it is desirable to relax A4, and fortunately this can 
be done by including B as a component of W in the Cox model (1) and 
estimating its coefficient (as suggested by the Associate Editor). This extra 
coefficient for B is identified by the data from vaccine recipients with B 
measured. Based on the argument given by Follmann (2006) and Gilbert and 
Hudgens (2006) for the setting of the BIP-alone design and a dichotomous 
clinical endpoint, we conjecture that the estimand $VE(s_1)$ will be identified
from the observed data as long as at least one of the interaction terms of B 
with V or W with V is omitted from the Cox model. 

Our approach specified a Cox regression with $S_i(1)$ and $S_i(1)V$ as
covariates. Another approach is to assume that the immune response is measured
with some "error," $S_i(1) = S_i^{\mathrm{true}}(1) + e_{i1}$ and
$S_i^c(1) = S_i^{\mathrm{true}}(1) + e_{i2}$ (as is done in A5), but then to
use the true immune responses $S_i^{\mathrm{true}}(1)$ and
$S_i^{\mathrm{true}}(1)V$ as covariates in the Cox model. To proceed with this
model, one could obtain replicates of $S_i(1)$ and $S_i^c(1)$, say,
$S_{i1}(1), S_{i2}(1)$ and $S_{i1}^c(1), S_{i2}^c(1)$, and assume that the
$e_{ij}$'s followed a Gaussian distribution with mean 0 and unknown variance
$\tau^2$. Then a more complicated likelihood could be written by integrating
$S_i^{\mathrm{true}}(1)$ over the distribution of
$S_i^{\mathrm{true}}(1) \mid S_{i1}(1), S_{i2}(1)$,
$S_i^{\mathrm{true}}(1) \mid S_{i1}^c(1), S_{i2}^c(1)$, or
$S_i^{\mathrm{true}}(1) \mid B_i$ as appropriate.
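
To make the replicate-based idea concrete, the sketch below estimates the
error variance from paired replicates and computes the conditional
distribution of the true response given two replicates. It rests on an
additional assumption not made in the paragraph above, namely that the true
response is itself Gaussian with known mean `mu` and variance `sigma2`; the
function names are hypothetical.

```python
import numpy as np

def error_variance_from_replicates(rep1, rep2):
    """Estimate tau^2 from paired replicates of the same true response.

    The difference of two replicates has mean 0 and variance 2 * tau^2.
    """
    d = np.asarray(rep1, dtype=float) - np.asarray(rep2, dtype=float)
    return np.mean(d ** 2) / 2.0

def true_response_given_replicates(rep1, rep2, mu, sigma2, tau2):
    """Mean and variance of S_true | two replicates.

    ASSUMPTION: S_true ~ N(mu, sigma2), independent errors ~ N(0, tau2).
    """
    rbar = (np.asarray(rep1, dtype=float) + np.asarray(rep2, dtype=float)) / 2.0
    shrink = sigma2 / (sigma2 + tau2 / 2.0)   # weight on the replicate mean
    mean = mu + shrink * (rbar - mu)
    var = shrink * tau2 / 2.0
    return mean, var
```

The resulting normal distribution is the one over which the likelihood
contribution would be integrated under these assumptions.
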

We have presented estimated likelihood based methods to accommodate 
missing data in case-cohort designs, as well as a regression calibration based 
double-semiparametric EM algorithm that has reasonable performance when 
the regression calibration is reliable and the event is rare. This approximate 
algorithm enjoys the convenience of regression calibration to incorporate 
auxiliary information, and has faster and easier implementation for the con- 
tinuous failure time model. Alternative estimation methods such as multiple 
imputation may also be useful, provided the posterior distribution can be 
properly specified. In addition, a full likelihood approach that maximized 
over $(\beta, \lambda_0)$ and the parameters of $p(s \mid w)$ and $p(s \mid b, w)$ all at once could
be used. While the full likelihood should be more efficient if the entire joint
model is correctly specified, MEL is simpler to implement and may be more
robust to joint model mis-specification. 

APPENDIX: INTEGRAL CALCULATION IN LIKELIHOOD (4)

For discrete $S(1)$ and $B$, the integrations can be replaced by finite
summations. When $S(1)$ is continuous, the integrations can be made easier by
positing parametric models. Assume $S(1) \sim N(\mu(\cdot), \sigma(\cdot)^2)$,
where $\mu(\cdot)$, $\sigma(\cdot)^2$ represent the first two moments of $p(s)$
or $p(s \mid b)$. Then for a given function $g(s)$ of $s$,
$\int g(s)p(\cdot)\,ds = \int g(\mu(\cdot) + \sigma(\cdot)u)\phi(u)\,du$, where
$p(\cdot)$ denotes $p(s)$ or $p(s \mid b)$ and $\phi(u)$ is the standard normal
density function. Because the integrand $g(s)$ in (4) is a smooth function of
$s$, numerical methods such as Gaussian quadrature can be applied to evaluate
the integration. Based on our experience, only a small number (around 15) of
evaluations is needed to get stable quadrature results.
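
The change of variables above lends itself to Gauss-Hermite quadrature. The
following is a minimal sketch, assuming NumPy's probabilists' Hermite rule
(`numpy.polynomial.hermite_e.hermegauss`) as the quadrature; the function name
`normal_expectation` and the example integrand are illustrative and are not
taken from the likelihood itself.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def normal_expectation(g, mu, sigma, n_nodes=15):
    """Approximate the integral of g(s) against a N(mu, sigma^2) density."""
    # nodes and weights for the weight function exp(-u^2 / 2)
    u, w = hermegauss(n_nodes)
    # dividing by sqrt(2*pi) turns the weight function into phi(u)
    return np.sum(w * g(mu + sigma * u)) / np.sqrt(2.0 * np.pi)

# illustrative smooth integrand (hypothetical, not the g(s) of the likelihood)
value = normal_expectation(lambda s: 1.0 - 0.98 ** np.exp(-1.109 * s),
                           mu=0.0, sigma=np.sqrt(0.4))
```

An n-node rule of this kind integrates polynomials up to degree 2n - 1
exactly, which is why a small number of nodes suffices for smooth integrands.
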

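When $B$ has discrete values $b_l$, $l = 1, \ldots, L$, an alternative way to
integrate over $s$ is through the nonparametric representation of $p(s)$ and
$p(s \mid b)$. The integrals $\int g(s)p(\cdot)\,ds$ can be evaluated
nonparametrically by

$$\int g(s)p(s)\,ds \approx p_{11} \frac{1}{\sum_{i \in IC_v} \delta_i} \sum_{i \in IC_v} \delta_i\, g(S_i(1)) + p_{10} \frac{1}{\sum_{i \in IC_v} (1 - \delta_i)} \sum_{i \in IC_v} (1 - \delta_i)\, g(S_i(1)),$$

$$\int g(s)p(s \mid b_l)\,ds \approx p_{11} \frac{1}{\sum_{i \in IC_v, B_i = b_l} \delta_i} \sum_{i \in IC_v, B_i = b_l} \delta_i\, g(S_i(1)) + p_{10} \frac{1}{\sum_{i \in IC_v, B_i = b_l} (1 - \delta_i)} \sum_{i \in IC_v, B_i = b_l} (1 - \delta_i)\, g(S_i(1)).$$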
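
A nonparametric evaluation of this kind amounts to a weighted combination of
two empirical averages, one over infected and one over uninfected vaccine
recipients in the immunogenicity cohort. The sketch below illustrates that
reading; the function name and arguments are hypothetical, the weights `p11`
and `p10` are taken as given rather than estimated, and the sketch assumes
each stratum contains at least one subject.

```python
import numpy as np

def nonparametric_integral(g, s1, delta, p11, p10, b=None, b_value=None):
    """Weighted empirical average of g(S_i(1)) over IC vaccine recipients.

    s1, delta : immune responses and infection indicators in the cohort
    p11, p10  : weights for infected and uninfected subjects (assumed given)
    b, b_value: if supplied, restrict to subjects with B_i == b_value
    """
    s1 = np.asarray(s1, dtype=float)
    delta = np.asarray(delta)
    keep = np.ones(len(s1), dtype=bool) if b is None else (np.asarray(b) == b_value)
    infected = keep & (delta == 1)
    uninfected = keep & (delta == 0)
    return p11 * np.mean(g(s1[infected])) + p10 * np.mean(g(s1[uninfected]))
```

Passing `b` and `b_value` gives the conditional version for a single value of
the discrete BIP; omitting them gives the marginal version.
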